Validity of evaluation scales for post-stroke depression: a systematic review and meta-analysis

Background Post-stroke depression (PSD) is closely associated with poor stroke prognosis. However, identifying and assessing PSD remains challenging. This study aimed to identify scales for PSD diagnosis, assessment, and follow-up that are straightforward, accurate, efficient, and reproducible. Methods A systematic literature search was conducted in 7 electronic databases from January 1985 to December 2023. Results Thirty-two studies were included. The Patient Health Questionnaire-9 (PHQ-9) and the Hamilton Depression Rating Scale (HDRS) had higher diagnostic accuracy for PSD. The sensitivity, specificity, and diagnostic odds ratio of PHQ-9 for diagnosing any depression were 0.82, 0.87, and 29, respectively. For HDRS, used for diagnosing major depression, the corresponding values were 0.92, 0.89, and 94. Furthermore, these two scales also had higher diagnostic accuracy in assessing depressive symptoms during the acute and chronic phases of stroke. For patients with post-stroke aphasia and cognitive impairment, no scale with high diagnostic accuracy for assessing depressive symptoms has yet been identified. Conclusions The PHQ-9 and HDRS are recommended for assessing PSD. HDRS, which demonstrates high diagnostic performance, can replace structured interviews based on diagnostic criteria.


Introduction
Stroke is a significant cardiovascular disease, with its incidence rate and associated disease risks being of global concern [1]. With the increasing incidence of stroke worldwide, the number of people suffering from post-stroke depression (PSD) has increased significantly [2]. PSD is one of the most common complications after stroke. The main manifestations are depressed mood and loss of interest, often accompanied by somatic symptoms such as weight loss, insomnia, and fatigue [3,4]. PSD seriously hinders the recovery of neurological function in stroke patients, leading to prolonged hospital stays, loss of social interaction and independent living skills, and even increased stroke recurrence and mortality [5,6]. Therefore, early diagnosis and treatment of PSD are crucial for prognosis. Currently, the diagnosis of PSD is still based on structured interviews [7]. Since the pathogenesis of PSD is not entirely clear [8], the dual effects of stroke-induced brain damage and mental stress complicate its diagnosis. Presently, PSD is classified as a mental disorder rather than a neurological disorder. For example, in the Diagnostic and Statistical Manual of Mental Disorders-5th Edition (DSM-V), PSD is categorized under depressive disorder due to other physical diseases [7]; in the 10th revision of the International Classification of Diseases (ICD-10), it is classified as an organic mental disorder [9]; similarly, in the Chinese Classification and Diagnostic Standard of Mental Disorders (CCMD-3), it is regarded as a mental disorder caused by cerebrovascular diseases [10]. The differing diagnostic criteria across classification systems further complicate the diagnosis of PSD. Additionally, most of the scales used to assess PSD are borrowed from those used for Major Depressive Disorder (MDD) [4,11].
There are three main types of depression scales. The first type is self-rating scales, such as the Patient Health Questionnaire-9 (PHQ-9), Beck Depression Inventory (BDI), and Self-rating Depression Scale (SDS). The second type is clinician-rated scales, including the Hamilton Depression Rating Scale (HDRS) and Montgomery Asberg Depression Rating Scale (MADRS). The third type is depression assessment scales for specific populations, such as the Geriatric Depression Screening Scale (GDS) and Stroke Aphasic Depression Questionnaire (SADQ-10). Due to the lack of uniform standards, clinical studies may apply different scales to assess the same PSD populations or use a single scale to assess PSD populations with different characteristics. The validity of these scales varies widely, leading to differences in the epidemiology, diagnosis, and assessment of PSD. Although some research teams have developed PSD-specific scales, such as the Post-Stroke Depression Symptom Inventory (PSDS) [12] and Post-Stroke Depression Prediction Scale (DePreS) [13], their validity is still under clinical evaluation and they are not widely used.
Therefore, there is an urgent need to identify scales that can simplify the diagnostic process of PSD and facilitate prognosis evaluation. This meta-analysis aimed to identify accurate, simple, and reproducible assessment scales for PSD.

Literature search
Seven English-language electronic databases (PubMed, EMBASE, Medline, Web of Science, ClinicalTrials.gov, CINAHL, and Cochrane Library) were searched for published literature on PSD and scale assessment from January 1985 to December 2023. The search scope included title and abstract, and the language was limited to English. According to the Medical Subject Headings (MeSH), the search keywords included: 1) Post-stroke depression: 'post-stroke depression' or 'post stroke depression' or 'PSD' or 'depression after stroke' or 'emotional disturbances after stroke' or 'emotionalism after stroke' or 'vascular depression' or 'post stroke depressive disorder' or 'depressive disorder after stroke'. 2) Assessment: 'assessment scale' or 'validity' or 'measure' or 'measures' or 'evaluation'.

Inclusion and exclusion criteria
Inclusion criteria were as follows:
(1) The studies were original studies, including case-control and cohort studies, with a clearly defined period of development or publication.
(2) The study content involved the use of depression scales to evaluate PSD.
(3) Participants met the diagnostic criteria for stroke.
(4) The evaluation of PSD adhered to the relevant classification and diagnostic criteria (DSM, ICD, CCMD).
(5) The study provided the number of patients with stroke and PSD.

Exclusion criteria were:
(1) Animal studies related to PSD.
(2) Lack of clear criteria for the diagnosis of stroke.
(3) Failure to use the diagnostic criteria for PSD based on structured interviews or assessments.
(4) Researchers did not adopt scientific data collection methods.
(5) Inappropriate use of statistical methods in research or errors in data analysis.
(6) Reviews, systematic reviews, dissertations, conference papers, and repeated publications.
(7) The literature was not in English.

Data extraction
First, the studies retrieved from the databases were imported into EndNote X9.3.2 software (Thomson Scientific, America). After duplicate studies were removed, the titles and abstracts of the remaining studies were screened. Second, the included studies were identified after reading the full text of each study according to the inclusion and exclusion criteria. The extracted data mainly included: author, publication time, number of cases, assessment scales and cut-offs, PSD diagnostic criteria, type of stroke, time since stroke onset when depressive symptoms were evaluated, and type of depression.

Quality evaluation
Two reviewers independently assessed the quality and risk of bias of all included studies using the Risk Of Bias In Non-randomized Studies of Interventions (ROBINS-I) tool [14]. Any disagreements between the reviewers were discussed with a senior expert until a consensus was reached.

Data analysis
RevMan 5.4 statistical software provided by the Cochrane Collaboration was used for quality assessment of the data and statistical description. We used Stata 15.1 software for meta-analysis and heterogeneity testing. Where the heterogeneity between studies was P > 0.1 and I² < 50%, we employed a fixed-effect model for comprehensive analysis. Conversely, if the heterogeneity between studies was P ≤ 0.1 and I² ≥ 50%, the random-effect model was used. We utilized the bivariate mixed-effects model to assess the diagnostic efficacy of each scale, focusing on the key evaluation indicators [15]: sensitivity, specificity, positive likelihood ratio, negative likelihood ratio, and diagnostic odds ratio. Scales included in the evaluation had to meet the requirements of the bivariate mixed-effects model analysis, with a minimum sample size of 3 (n ≥ 3).
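The indicators and the model-selection rule described above can be sketched in a few lines of Python. This is only an illustration of the definitions, not the Stata or RevMan routines used in this study, and it does not implement the bivariate mixed-effects model. The function names are hypothetical, and one assumption is made explicit: when the P value and I² criteria disagree (a case the rule above does not cover), the sketch falls back to the random-effect model.

```python
from dataclasses import dataclass


@dataclass
class DiagnosticIndicators:
    sensitivity: float
    specificity: float
    lr_positive: float
    lr_negative: float
    dor: float


def diagnostic_indicators(tp: int, fp: int, fn: int, tn: int) -> DiagnosticIndicators:
    """Per-study indicators from a 2x2 table (scale result vs. structured interview).

    Zero cells are not handled here; a continuity correction would be needed.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    dor = lr_positive / lr_negative  # diagnostic odds ratio
    return DiagnosticIndicators(sensitivity, specificity, lr_positive, lr_negative, dor)


def i_squared(q: float, df: int) -> float:
    """I-squared (in percent) from Cochran's Q and its degrees of freedom."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def choose_model(p_heterogeneity: float, i2_percent: float) -> str:
    """Fixed-effect only when P > 0.1 and I-squared < 50%.

    Assumption: every other combination is treated as heterogeneous.
    """
    if p_heterogeneity > 0.1 and i2_percent < 50.0:
        return "fixed"
    return "random"


if __name__ == "__main__":
    # Hypothetical 2x2 counts, not taken from any included study.
    print(diagnostic_indicators(tp=40, fp=10, fn=10, tn=90))
    print(choose_model(p_heterogeneity=0.05, i2_percent=i_squared(20.0, 9)))
```

Because the diagnostic odds ratio is the ratio of the two likelihood ratios, it summarizes sensitivity and specificity in a single number, which is why it is reported alongside them in the tables.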
Subgroup analysis was divided into three subgroups: (1) Depression type, which was divided into an any depression group and a major depression group. Major depression was defined according to the diagnosis of MDD in DSM-V [7]: patients were required to have five or more of nine depressive symptoms lasting more than two weeks after the stroke event, at least one of which was 1) depressed mood or 2) loss of interest or pleasure. The definition of any depression was broader, according to the depressive disorder definition in DSM-III [16], encompassing adjustment disorder with depressive mood, disorder, and dysthymia. (2) Stroke staging, which was divided into the acute phase after stroke (≤ 2 months) and the chronic phase after stroke (> 2 months). (3) Specific populations, which included patients with certain characteristics, such as a comorbid history of pre-stroke depression, stroke with aphasia, cognitive dysfunction, and other features.
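The staging and depression-type rules above can be written as simple predicates. The sketch below is a simplified reading of the definitions in this section, not a clinical tool, and the function and parameter names are hypothetical. It assumes time since stroke is expressed in months and interprets "lasting more than two weeks" as a duration strictly greater than two weeks.

```python
def stroke_phase(months_since_stroke: float) -> str:
    """Acute phase: up to and including 2 months; chronic phase: more than 2 months."""
    if months_since_stroke < 0:
        raise ValueError("months_since_stroke must be non-negative")
    return "acute" if months_since_stroke <= 2 else "chronic"


def meets_major_depression_rule(
    n_symptoms: int,
    depressed_mood: bool,
    loss_of_interest: bool,
    duration_weeks: float,
) -> bool:
    """Five or more of nine symptoms, at least one core symptom, lasting more than two weeks.

    Assumption: n_symptoms already counts the core symptoms that are present.
    """
    has_core_symptom = depressed_mood or loss_of_interest
    return n_symptoms >= 5 and has_core_symptom and duration_weeks > 2


if __name__ == "__main__":
    # Hypothetical patient assessed 6 weeks (about 1.5 months) after stroke.
    print(stroke_phase(1.5))
    print(meets_major_depression_rule(5, True, False, 3))
```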

Results
This study followed the PRISMA reporting guidelines [17]. The screening flowchart is shown in Fig. 1. Thirty-two studies [12,13, involving 3865 people aged between 18 and 92 were included. The relevant information from the studies is presented in Table 1. ROBINS-I was used to evaluate the quality of the included literature, and the evaluation results are presented in Fig. 2 and Fig. 3.

Meta-analysis of scale selection
Sensitivity and specificity of the scales were assessed when the number of articles involved in each scale was two or more (n ≥ 2). The study assessed ten scales (PHQ-9, HDRS, MADRS, BDI, GDS, HADS-D, PHQ-2, CES-D, HADS, and PSDS) involving 28 articles. These ten scales had different sensitivities and specificities, and the same scale had different sensitivities and specificities in different studies (Fig. 4).

Subgroup analysis
Depression type
Any depression Five scales were used to assess PSD when depression was classified as any depression in the study. Overall, PHQ-9 had high diagnostic efficacy when both sensitivity and specificity were considered, with a sensitivity of 0.82 (95%CI: 0.72-0.89), specificity of 0.87 (95%CI: 0.68-0.95), and diagnostic odds ratio of 29 (95%CI: 10.0-84.0). If only higher sensitivity was required, HDRS and MADRS were more advantageous. However, when only higher specificity was considered, PHQ-9 and HADS-D were more advantageous (Table 2).
Major depression When depression was classified as major depression, six scales were used to assess PSD. Overall, when sensitivity and specificity were considered together, HDRS had high diagnostic power, with a sensitivity of 0.92 (95%CI: 0.82-0.97), specificity of 0.89 (95%CI: 0.84-0.92), and diagnostic odds ratio of 94 (95%CI: 32-281). Likewise, if only sensitivity was considered, BDI, HDRS, and MADRS had the advantage, whereas for higher specificity, PHQ-9 and PHQ-2 had the advantage (Table 3).
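As a rough consistency check on the figures above, the diagnostic odds ratio can be recomputed from the pooled sensitivity (Se) and specificity (Sp). The symbols Se, Sp, and DOR are introduced here for illustration. Plugging in the rounded point estimates gives values close to, but not identical with, the reported ones, presumably because the reported odds ratios come from the fitted model rather than from rounded summary values.

```latex
\mathrm{DOR} = \frac{\mathrm{LR}^{+}}{\mathrm{LR}^{-}}
             = \frac{\mathrm{Se}/(1-\mathrm{Sp})}{(1-\mathrm{Se})/\mathrm{Sp}}
             = \frac{\mathrm{Se}\cdot\mathrm{Sp}}{(1-\mathrm{Se})(1-\mathrm{Sp})}

\text{PHQ-9 (any depression):}\quad
\frac{0.82 \times 0.87}{0.18 \times 0.13} = \frac{0.7134}{0.0234} \approx 30.5
\quad (\text{reported: } 29)

\text{HDRS (major depression):}\quad
\frac{0.92 \times 0.89}{0.08 \times 0.11} = \frac{0.8188}{0.0088} \approx 93.0
\quad (\text{reported: } 94)
```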

Staging of stroke
Acute phase after stroke A total of three scales were used to assess PSD in the acute phase of stroke. PHQ-9 had high diagnostic performance when both sensitivity and specificity were considered, with a sensitivity of 0.85 (95%CI: 0.78-0.91), specificity of 0.90 (95%CI: 0.82-0.95), and diagnostic odds ratio of 55 (95%CI: 30-102). If only higher sensitivity was considered, MADRS was more favorable, and if only higher specificity was considered, PHQ-9 was more favorable (Table 4).
Chronic phase after stroke Eight scales were used to assess PSD in the chronic phase of stroke. Overall, when sensitivity and specificity were considered together, HDRS had high diagnostic power (Table 5).

Specific populations
To analyze the specific populations for PSD, 9 out of 32 studies compared the baseline characteristics of depressed and non-depressed patients after stroke. Based on previous studies and the data included in this study, a total of seven specific populations were analyzed, with clinical features including cognitive impairment, severe aphasia, pre-onset antidepressant medication, first stroke, severity of neurological deficit, educational level, and previous psychiatric history (Table 6). However, due to the different inclusion and exclusion criteria and priorities among the original studies, the included data were insufficient, and effective statistical analysis could not be performed.

Discussion
Thirty-two studies were analyzed to determine the best assessment scale for PSD. The results showed that each of these scales (PHQ-9, HDRS, MADRS, BDI, PHQ-2, CES-D, and HADS-D) had different degrees of advantage in diagnosing PSD based on depression type and stroke staging. When evaluating PSD, PHQ-9 exhibited higher diagnostic efficacy for any depression and for the acute phase after stroke compared to other scales. Conversely, HDRS performed better for major depression and for the chronic phase after stroke. Due to limitations in the data included in the literature, no effective scale has yet been found to accurately assess PSD in patients with combined aphasia and cognitive impairments. Currently, many studies utilize depression assessment scales for diagnosing PSD. However, controversy remains, as some studies suggest that these scales are not suitable for diagnosing PSD but rather for assessing the severity of depressive symptoms, treatment efficacy, or prognosis [48,49]. Whether a scale can substitute for structured interviews in diagnosing PSD depends on its diagnostic accuracy. Our analysis revealed that PHQ-9 and HDRS performed very well in identifying depressive symptoms and their severity. The PHQ-9 is a self-rating scale consisting of 9 items with high sensitivity and specificity [50,51]. It has been widely used in screening for PSD because it is simple, quick to administer, and requires little patient cooperation. HDRS, introduced in 1960, comprises seven categories, including items for somatic symptoms [52]. It is well known that in the chronic phase of stroke, many patients experience atypical depressive symptoms, such as gastrointestinal symptoms, weight loss, general pain, fatigue, and other physical discomforts [53]. HDRS can be used to assess these patients more accurately. Additionally, studies have shown that HDRS is used not only to evaluate the severity of PSD but also to assess the efficacy of antidepressant treatment [54,55].
Burton conducted a review of the scales used for screening post-stroke mood disorders in 2015 [56]. That review focused on mood disorders after stroke, which include major depression, any degree of depression, and anxiety. Meader also conducted a related meta-analysis in 2014, which included 24 studies involving 2907 patients [57]. The results showed that many scales, such as CES-D, HDRS, and PHQ-9, could screen for PSD. However, these scales should not be used alone but should be combined with detailed clinical assessments. In comparison to Burton's and Meader's studies, our study included thirty-two studies, and we provided a clearer description of the stage of stroke and the type of depression for PSD. Additionally, we discussed the selection of scales for PSD in special populations and analyzed the prevalence of PSD.
For the staging of stroke, there is still no unified conclusion at present, and the time since stroke onset affects the symptoms of PSD [58,59]. Some studies recommend assessing PSD at 2 or 8 weeks after stroke, and Toso's study found that PSD most often occurred within 3 months after stroke [60]. In our study, stroke was staged into the acute phase (within 2 months of stroke onset) and the chronic phase (more than 2 months after stroke onset). According to the severity of depression, Robinson classified PSD into mild PSD (mild depression) and severe PSD (severe depression). Mild PSD corresponds to dysthymia in DSM-III, while severe PSD meets the diagnostic criteria for MDD [61]. Therefore, in this study, PSD was divided into two groups: any depression and major depression, and it should be emphasized that any depression included major depression and mild depression.
This study aimed to analyze which scale was more effective in identifying and assessing depressive symptoms in the specific populations with PSD. However, due to the different inclusion and exclusion criteria and priorities among the original studies, the included data were insufficient, and effective statistical analysis could not be performed. Stroke patients often experience complications such as aphasia and cognitive dysfunction, which can exacerbate PSD. A related study found that patients with post-stroke aphasia are more likely to suffer from depression than patients without aphasia [62]. According to a systematic review by Mariska, there was insufficient evidence supporting the use of a specific scale to evaluate depressive symptoms in aphasia patients, and the evidence level of existing studies was relatively low [63]. In addition, relevant studies have shown that post-stroke cognitive impairment (PSCI) is closely related to the occurrence of PSD [64,65]. Impairment of cognitive function can affect the evaluation of depressive symptoms to varying degrees. At present, cognitive function scales based on the assessment of Alzheimer's disease are often used in clinical work to assess PSCI, such as the Mini-Mental State Examination (MMSE), Montreal Cognitive Assessment Scale (MoCA), and Cambridge Geriatric Cognitive Scale (CAMCOG). However, the organic damage of cerebral parenchyma in stroke patients, along with complications such as aphasia, visual impairment, dyslexia, and limb dysfunction, can impose limitations on the evaluation of PSCI using the aforementioned scales [66,67]. Hence, further research is warranted to determine the most suitable scales for assessing depressive symptoms in patients with post-stroke aphasia and cognitive impairment.

Table 6 Scale selection for specific populations of post-stroke depression
The data are all data of patients diagnosed with post-stroke depression, "/" not mentioned in the original study, "Exclude" the original study has been excluded, "Yes" the patients included in the original study are all first-time stroke patients, "NO" Not all patients included in the original study were first-time stroke patients, Educational level high school level or above, MMSE Mini-mental state examination, NIHSS National institutes of health stroke scale, PHQ-9 Patient health questionnaire-9, CES-D Center for epidemiological studies-depression, PHQ-2 The patient health questionnaire-2
The results of the study revealed that the prevalence of PSD, determined through standard structured interviews, ranged from 17.0% to 29.0%. Previous studies by Ayerbe and Hackett indicated that approximately one-third of stroke patients experienced varying degrees of depression within five years after the stroke event [68-70]. It is important to note that the assessment of prevalence was primarily conducted using depression scales. Many factors affect the prevalence of PSD, such as the population, time, and place of assessment. There is currently a divergence of opinions regarding whether the timing of PSD assessment influences the prevalence of depression. Some studies showed that the prevalence of depression in the acute phase after stroke was higher than in the chronic phase and that the prevalence gradually decreased over time [71-73]. However, another study found no difference in the prevalence of PSD in the early, middle, and late stages of stroke [74]. Therefore, more high-quality prospective studies will be needed in the future to clarify this issue.
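Prevalence also determines how a positive or negative scale result should be read. As a hedged illustration, the sketch below applies Bayes' rule to the pooled PHQ-9 values for any depression reported in the Results (sensitivity 0.82, specificity 0.87) at the two ends of the prevalence range given above (17% and 29%). The function name is hypothetical, and the calculation assumes that the pooled accuracy applies unchanged at each prevalence, which may not hold in practice.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) for a given prevalence using Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    # Pooled PHQ-9 values for any depression, evaluated at both ends
    # of the interview-based prevalence range.
    for prevalence in (0.17, 0.29):
        ppv, npv = predictive_values(0.82, 0.87, prevalence)
        print(f"prevalence={prevalence:.2f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

Under these assumptions, the positive predictive value rises with prevalence while the negative predictive value falls, so the same scale result carries different weight in populations with different baseline rates of PSD.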

Limitations
There are also some limitations in this study. (1) This study was a secondary analysis, and the included studies exhibited significant heterogeneity due to variations in the diagnostic thresholds for each scale. Additionally, the optimal diagnostic cut-off of each scale was not analyzed and needs to be clarified in future studies. (2) Data limitations and mismatches between the original studies hindered subgroup analyses of scale selection, thereby preventing adequate analyses for different types and severity of stroke, the aphasia population, the elderly population, individuals with a history of depression, and other populations. In the future, developing more comprehensive research protocols for PSD is crucial.

Conclusion
In conclusion, various scales are available to evaluate PSD. To improve diagnostic effectiveness, a variety of scales can be used for dynamic, multi-directional evaluation and follow-up. The PHQ-9 and HDRS are recommended for the evaluation of PSD due to their high diagnostic efficiency. Structured interviews based on diagnostic criteria can determine whether stroke patients have depressive symptoms, and depression scales can further determine the severity of symptoms. Rating scales with high diagnostic efficacy, such as HDRS, are recommended as a replacement for structured interviews based on diagnostic criteria. Currently, there is still a lack of depression scales for evaluating patients with post-stroke aphasia and cognitive dysfunction.

Fig. 1 The flow chart of literature screening

Fig. 2

Fig. 3 Summary plot of risk of bias and fitness items

AIS Acute ischemic stroke, ICH Acute cerebral haemorrhage, TIA Transient ischemic attack, DSM-V Diagnostic and statistical manual of mental disorders, ICD-10 International classification of diseases, SCID Structured clinical interview for DSM, MINI Mini-international neuropsychiatric interview, "Unclear" the specific type of stroke or onset time in the included population was unknown, "/" No clear data mentioned in the original study, CDIS Computerized version of the national institute of mental health diagnostic interview schedule, ADRS Aphasic depression rating scale, BDI Beck depression inventory, CES-D Center for epidemiological studies-depression, DePreS Post-stroke depression prediction scale, GDS Geriatric depression screening scale, HADS Hospital anxiety and depression scale, HADS-D Hospital anxiety and depression scale-depression, HDRS Hamilton depression scale, MADRS Montgomery asberg depression rating scale, PHQ-2 The patient health questionnaire-2, PHQ-9 Patient health questionnaire-9, PSDRS Post stroke depression rating scale, PSDS Post-stroke depression symptom inventory

Fig. 5 Prevalence of post-stroke depression in different stroke periods and depression types (forest plots)

Table 1
General information about the included literature

Table 2
Validity analysis of the scale to assess post-stroke depression with any depression
PHQ-9 Patient health questionnaire-9, MADRS Montgomery asberg depression rating scale, HDRS Hamilton depression scale, HADS-D Hospital anxiety and depression scale-depression, GDS Geriatric depression screening scale

Table 3
Validity analysis of the scale to assess post-stroke depression with major depression
PHQ-9 Patient health questionnaire-9, PHQ-2 The patient health questionnaire-2, MADRS Montgomery asberg depression rating scale, HDRS Hamilton depression scale, HADS-D Hospital anxiety and depression scale-depression, BDI Beck depression inventory

Table 4
Validity analysis of the scale to assess post-stroke depression in the acute phase of stroke
PHQ-9 Patient health questionnaire-9, MADRS Montgomery asberg depression rating scale, HDRS Hamilton depression scale

Table 5
Validity analysis of the scale to assess post-stroke depression in the chronic phase of stroke
PHQ-9 Patient health questionnaire-9, PHQ-2 The patient health questionnaire-2, MADRS Montgomery asberg depression rating scale, HDRS Hamilton depression scale, HADS-D Hospital anxiety and depression scale-depression, GDS Geriatric depression screening scale, CES-D Center for epidemiological studies-depression, BDI Beck depression inventory

Table 6 (abbreviations, continued)
HADS-D Hospital anxiety and depression scale-depression, GDS-15 Geriatric depression screening scale-15, PSDS Post-stroke depression symptom inventory, JSS-D Japan stroke scale-depression scale

Table 7
Prevalence of post-stroke depression in different stroke periods and depression types